AITopics | Cundinamarca Department

Learning Curves for Decision Making in Supervised Machine Learning: A Survey

arXiv.org Artificial Intelligence, Jan-28-2025

Learning curves are a concept from social sciences that has been adopted in the context of machine learning to assess the performance of a learning algorithm with respect to a certain resource, e.g., the number of training examples or the number of training iterations. Learning curves have important applications in several machine learning contexts, most notably in data acquisition, early stopping of model training, and model selection. For instance, learning curves can be used to model the performance of the combination of an algorithm and its hyperparameter configuration, providing insights into their potential suitability at an early stage and often expediting the algorithm selection process. Various learning curve models have been proposed to use learning curves for decision making. Some of these models answer the binary decision question of whether a given algorithm at a certain budget will outperform a certain reference performance, whereas more complex models predict the entire learning curve of an algorithm. We contribute a framework that categorises learning curve approaches using three criteria: the decision-making situation they address, the intrinsic learning curve question they answer and the type of resources they use. We survey papers from the literature and classify them into this framework.
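The binary decision question mentioned in the abstract above (will an algorithm at a given budget outperform a reference performance?) can be illustrated with a small sketch. This is not a model taken from the survey: it assumes, purely for illustration, that the error rate follows an inverse power law error(n) = a * n^(-b) in the number of training examples n, and fits it by least squares in log-log space. The function names and the observations are hypothetical.

```python
import math


def fit_power_law(sizes, errors):
    """Fit error(n) = a * n**(-b) by least squares on log(error) vs. log(n)."""
    xs = [math.log(n) for n in sizes]
    ys = [math.log(e) for e in errors]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    a = math.exp(mean_y - slope * mean_x)
    return a, -slope  # b is the negated slope of the log-log line


def predict_error(a, b, n):
    """Extrapolate the fitted curve to a training set size n."""
    return a * n ** (-b)


def will_beat_reference(sizes, errors, budget, reference_error):
    """Binary decision: is the extrapolated error at `budget` below the reference?"""
    a, b = fit_power_law(sizes, errors)
    return predict_error(a, b, budget) < reference_error


# Hypothetical observations: each doubling of the data multiplies the error by 0.75.
sizes = [100, 200, 400, 800]
errors = [0.40, 0.30, 0.225, 0.16875]

print(will_beat_reference(sizes, errors, budget=6400, reference_error=0.10))
```

A plain power law has no error floor, so the extrapolation is optimistic for large budgets; parametric models with an additional offset term are a common refinement.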

artificial intelligence, learner, machine learning, (18 more...)

arXiv.org Artificial Intelligence

doi: 10.1007/s10994-024-06619-7

2201.1215

Country:

Europe > Netherlands > South Holland > Leiden (0.04)
Asia > Middle East > Jordan (0.04)
South America > Colombia > Cundinamarca Department (0.04)
North America > United States > Wisconsin (0.04)

Genre:

Research Report > New Finding (1.00)
Overview (1.00)
Research Report > Experimental Study (0.67)

Industry:

Education (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (0.45)
Health & Medicine > Therapeutic Area (0.45)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)
(3 more...)

The Unreasonable Effectiveness Of Early Discarding After One Epoch In Neural Network Hyperparameter Optimization

Egele, Romain, Mohr, Felix, Viering, Tom, Balaprakash, Prasanna

arXiv.org Artificial Intelligence, Apr-5-2024

To reach high performance with deep learning, hyperparameter optimization (HPO) is essential. This process is usually time-consuming due to costly evaluations of neural networks. Early discarding techniques limit the resources granted to unpromising candidates by observing the empirical learning curves and canceling neural network training as soon as the lack of competitiveness of a candidate becomes evident. Despite two decades of research, little is understood about the trade-off between the aggressiveness of discarding and the loss of predictive performance. Our paper studies this trade-off for several commonly used discarding techniques such as successive halving and learning curve extrapolation. Our surprising finding is that these commonly used techniques offer minimal to no added value compared to the simple strategy of discarding after a constant number of epochs of training. The chosen number of epochs depends mostly on the available compute budget. We call this approach i-Epoch (i being the constant number of epochs with which neural networks are trained) and suggest assessing the quality of early discarding techniques by comparing how their Pareto front (in consumed training epochs and predictive performance) complements the Pareto front of i-Epoch.
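The contrast between constant-epoch discarding and successive halving can be sketched as follows. This is an illustrative toy, not the authors' implementation: learning curves are given up front as hypothetical lists of validation scores per epoch, the winner is simply the candidate with the best score at the last epoch it was observed, and the helper names `i_epoch` and `successive_halving` are assumptions made for this example.

```python
def i_epoch(curves, i):
    """Observe every candidate for exactly i epochs and keep the best one.

    Returns (winner, total epochs consumed).
    """
    scores = {name: curve[i - 1] for name, curve in curves.items()}
    best = max(scores, key=scores.get)
    return best, i * len(curves)


def successive_halving(curves, min_epochs=1, eta=2):
    """Keep the top 1/eta candidates per round, multiplying the epoch budget by eta.

    Returns (winner, total epochs consumed), assuming training can be resumed.
    """
    alive = list(curves)
    trained = {name: 0 for name in curves}
    max_epochs = min(len(curve) for curve in curves.values())
    budget = min_epochs
    while len(alive) > 1 and budget <= max_epochs:
        for name in alive:
            trained[name] = budget  # resume training up to the current budget
        alive.sort(key=lambda name: curves[name][budget - 1], reverse=True)
        alive = alive[: max(1, len(alive) // eta)]
        budget *= eta
    return alive[0], sum(trained.values())


# Hypothetical validation accuracies per epoch for four configurations.
curves = {
    "a": [0.50, 0.60, 0.65, 0.68],
    "b": [0.55, 0.70, 0.78, 0.82],
    "c": [0.40, 0.45, 0.47, 0.48],
    "d": [0.52, 0.58, 0.60, 0.61],
}

print(i_epoch(curves, 1))
print(successive_halving(curves))
```

On these made-up curves both strategies select the same configuration, while the constant-epoch strategy with i = 1 consumes fewer epochs. Each strategy yields one point in the plane of consumed epochs and predictive performance, which is the kind of comparison the abstract proposes.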

configuration, epoch, optimization, (15 more...)

arXiv.org Artificial Intelligence

2404.04111

Country:

North America > United States (0.46)
South America > Colombia > Cundinamarca Department (0.04)
South America > Colombia > Bogotá D.C. > Bogotá (0.04)
(2 more...)

Genre: Research Report > New Finding (0.93)

Industry:

Energy (0.94)
Government > Regional Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.49)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)

Towards Green Automated Machine Learning: Status Quo and Future Directions

Tornede, Tanja, Tornede, Alexander, Hanselle, Jonas, Wever, Marcel, Mohr, Felix, Hüllermeier, Eyke

arXiv.org Artificial Intelligence, Jun-13-2023

Automated machine learning (AutoML) strives for the automatic configuration of machine learning algorithms and their composition into an overall (software) solution - a machine learning pipeline - tailored to the learning task (dataset) at hand. Over the last decade, AutoML has developed into an independent research field with hundreds of contributions. At the same time, AutoML is being criticised for its high resource consumption as many approaches rely on the (costly) evaluation of many machine learning pipelines, as well as the expensive large-scale experiments across many datasets and approaches. In the spirit of recent work on Green AI, this paper proposes Green AutoML, a paradigm to make the whole AutoML process more environmentally friendly. Therefore, we first elaborate on how to quantify the environmental footprint of an AutoML tool. Afterward, different strategies on how to design and benchmark an AutoML tool w.r.t. its "greenness", i.e., sustainability, are summarized. Finally, we elaborate on how to be transparent about the environmental footprint and what kind of research incentives could direct the community in a more sustainable AutoML research direction. Additionally, we propose a sustainability checklist to be attached to every AutoML paper featuring all core aspects of Green AutoML.

artificial intelligence, machine learning, proceedings, (17 more...)

arXiv.org Artificial Intelligence

doi: 10.1613/jair.1.14340

2111.0585

Country:

South America > Colombia > Cundinamarca Department (0.04)
North America > United States (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)

Genre: Research Report (1.00)

Industry: Energy (0.97)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Towards Green Automated Machine Learning: Status Quo and Future Directions

Tornede, Tanja (Paderborn University) | Tornede, Alexander | Hanselle, Jonas | Mohr, Felix | Wever, Marcel | Hüllermeier, Eyke

Journal of Artificial Intelligence Research, Jun-12-2023

Automated machine learning (AutoML) strives for the automatic configuration of machine learning algorithms and their composition into an overall (software) solution — a machine learning pipeline — tailored to the learning task (dataset) at hand. Over the last decade, AutoML has developed into an independent research field with hundreds of contributions. At the same time, AutoML is being criticized for its high resource consumption as many approaches rely on the (costly) evaluation of many machine learning pipelines, as well as the expensive large-scale experiments across many datasets and approaches. In the spirit of recent work on Green AI, this paper proposes Green AutoML, a paradigm to make the whole AutoML process more environmentally friendly. Therefore, we first elaborate on how to quantify the environmental footprint of an AutoML tool. Afterward, different strategies on how to design and benchmark an AutoML tool w.r.t. their “greenness”, i.e., sustainability, are summarized. Finally, we elaborate on how to be transparent about the environmental footprint and what kind of research incentives could direct the community in a more sustainable AutoML research direction. As part of this, we propose a sustainability checklist to be attached to every AutoML paper featuring all core aspects of Green AutoML.

automl, footprint, proceedings, (16 more...)

Journal of Artificial Intelligence Research

doi: 10.1613/jair.1.14340

AI Access Foundation

14340

Country:

Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)
South America > Colombia > Cundinamarca Department (0.04)
North America > United States (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre:

Research Report (0.67)
Overview (0.67)

Industry: Energy (0.97)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

PyExperimenter: Easily distribute experiments and track results

Tornede, Tanja, Tornede, Alexander, Fehring, Lukas, Gehring, Lukas, Graf, Helena, Hanselle, Jonas, Mohr, Felix, Wever, Marcel

arXiv.org Artificial Intelligence, Apr-21-2023

PyExperimenter is intended to be used by researchers in the field of artificial intelligence, but is not limited to them. The empirical analysis of algorithms usually involves executing the algorithms for different inputs and algorithm variants, specified via parameters, and measuring non-functional properties. Since the individual evaluations are usually independent, the evaluation can be performed in a distributed manner on an HPC system. However, setting up, documenting, and evaluating the results of such a study is often file-based. Usually, this requires extensive manual work to create configuration files for the inputs or to read and aggregate measured results from a report file.

artificial intelligence, experiment, machine learning, (14 more...)

arXiv.org Artificial Intelligence

doi: 10.21105/joss.05149

2301.06348

Country:

South America > Colombia > Cundinamarca Department (0.05)
Europe > Germany > Bavaria > Upper Bavaria > Munich (0.05)

Genre: Research Report (0.65)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.32)

Automated Machine Learning, Bounded Rationality, and Rational Metareasoning

Hüllermeier, Eyke, Mohr, Felix, Tornede, Alexander, Wever, Marcel

arXiv.org Artificial Intelligence, Sep-10-2021

The notion of bounded rationality originated from the insight that perfectly rational behavior cannot be realized by agents with limited cognitive or computational resources. Research on bounded rationality, mainly initiated by Herbert Simon, has a longstanding tradition in economics and the social sciences, but also plays a major role in modern AI and intelligent agent design. Taking actions under bounded resources requires an agent to reflect on how to use these resources in an optimal way - hence, to reason and make decisions on a meta-level. In this paper, we will look at automated machine learning (AutoML) and related problems from the perspective of bounded rationality, essentially viewing an AutoML tool as an agent that has to train a model on a given set of data, and the search for a good way of doing so (a suitable "ML pipeline") as deliberation on a meta-level.

agent, rationality, tornede, (13 more...)

arXiv.org Artificial Intelligence

2109.04744

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > United States > New York (0.04)
Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)
(8 more...)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.86)

Run2Survive: A Decision-theoretic Approach to Algorithm Selection based on Survival Analysis

Tornede, Alexander, Wever, Marcel, Werner, Stefan, Mohr, Felix, Hüllermeier, Eyke

arXiv.org Machine Learning, Jul-10-2020

Algorithm selection (AS) deals with the automatic selection of an algorithm from a fixed set of candidate algorithms most suitable for a specific instance of an algorithmic problem class, where "suitability" often refers to an algorithm's runtime. Due to possibly extremely long runtimes of candidate algorithms, training data for algorithm selection models is usually generated under time constraints in the sense that not all algorithms are run to completion on all instances. Thus, training data usually comprises censored information, as the true runtime of algorithms that timed out remains unknown. However, many standard AS approaches are not able to handle such information properly. On the other hand, survival analysis (SA) naturally supports censored data and offers appropriate ways to use such data for learning distributional models of algorithm runtime, as we demonstrate in this work. We leverage such models as a basis of a sophisticated decision-theoretic approach to algorithm selection, which we dub Run2Survive. Moreover, taking advantage of a framework of this kind, we advocate a risk-averse approach to algorithm selection, in which the avoidance of a timeout is given high priority. In an extensive experimental study with the standard benchmark ASlib, our approach is shown to be highly competitive and in many cases even superior to state-of-the-art AS approaches.
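How survival analysis uses censored runtimes can be shown with a small sketch. It is not the Run2Survive model: it uses a plain Kaplan-Meier estimator per algorithm and ignores instance features, which is a deliberate simplification for illustration. The function names and the runtime data are hypothetical. A run is recorded as (time, finished), where finished=False means the run was stopped at the cutoff and its true runtime is unknown.

```python
def kaplan_meier(observations):
    """Return [(t, S(t))] at each observed finishing time t.

    S(t) estimates the probability that a run is still going after time t.
    """
    event_times = sorted({t for t, finished in observations if finished})
    survival = 1.0
    steps = []
    for t in event_times:
        at_risk = sum(1 for time, _ in observations if time >= t)
        events = sum(1 for time, finished in observations if finished and time == t)
        survival *= 1 - events / at_risk
        steps.append((t, survival))
    return steps


def timeout_probability(observations, cutoff):
    """Estimated probability that a run has not finished by the cutoff."""
    prob = 1.0
    for t, s in kaplan_meier(observations):
        if t <= cutoff:
            prob = s
    return prob


def risk_averse_choice(runs_by_algorithm, cutoff):
    """Pick the algorithm with the lowest estimated timeout probability."""
    return min(
        runs_by_algorithm,
        key=lambda name: timeout_probability(runs_by_algorithm[name], cutoff),
    )


# Hypothetical runs with a cutoff of 100 seconds.
runs = {
    "x": [(10, True), (20, True), (100, False), (100, False)],
    "y": [(40, True), (50, True), (60, True), (100, False)],
}

print(risk_averse_choice(runs, cutoff=100))
```

In this made-up data, algorithm "x" looks faster if only its finished runs are averaged (15 versus 50 seconds), but it times out in half of its runs. Taking the censored runs into account, the risk-averse rule prefers "y".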

algorithm, artificial intelligence, machine learning, (15 more...)

arXiv.org Machine Learning

2007.02816

Country:

Europe > Germany (0.04)
South America > Colombia > Cundinamarca Department (0.04)

Genre: Research Report > New Finding (0.66)

Industry: Law > Civil Rights & Constitutional Law (0.59)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
